# Video Content Analysis

A Japanese Vision-Language Model (VLM) developed by NABLAS, supporting image, multi-image, and video inputs, suitable for various multimodal tasks.

Transformers Japanese

Video-R1-7B is a multimodal large language model optimized based on Qwen2.5-VL-7B-Instruct, focusing on video reasoning tasks, capable of understanding video content and answering related questions.

Transformers English

Youtube Xlm Roberta Base Sentiment Multilingual

A YouTube comment sentiment analysis model fine-tuned from cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual, with a reported accuracy of 80.17%.

Text Classification

Ola-7B is a multimodal language model jointly developed by Tencent, Tsinghua University, and Nanyang Technological University. Based on the Qwen2.5 architecture, it accepts text, image, video, and audio inputs and produces text output.

Safetensors Supports Multiple Languages

Smolvlm2 256M Video Instruct

SmolVLM2-256M-Video is a lightweight multimodal model specifically designed for analyzing video content, capable of processing video, image, and text inputs to generate text outputs.

Transformers English

Eagle2 is a high-performance series of vision-language models focused on enhancing model performance through optimized data strategies and training methods. Eagle2-9B is the large model in this series, achieving a good balance between performance and inference speed.

Transformers Other

KnutJaegersberg

Internvl 2 5 HiCo R64

A video multimodal large language model enhanced by Long and Rich Context (LRC) modeling, which improves on existing MLLMs by strengthening the perception of fine-grained details and the capture of long-term temporal structures.

Transformers English

Videomae Large Finetuned Deepfake Subset

A fine-tuned version of the MCG-NJU/videomae-large model, trained on the deepfake detection challenge dataset and used for video deepfake detection.

Video Processing

Mplug Owl3 2B 241014

mPLUG-Owl3 is an advanced multimodal large language model focused on addressing the challenges of long image sequence understanding, significantly improving processing speed and sequence length through the Hyper Attention mechanism.

Text-to-Image English


© 2025 AIbase